State Estimation in a Document Image and Its Application in Text Block Identification and Text Line Extraction

Koo, Hyung Il; Cho, Nam Ik

doi:10.1007/978-3-642-15552-9_31


Conference paper


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6312)

Abstract

This paper proposes a new approach to estimating document states, such as interline spacing and text line orientation, which facilitates a number of tasks in document image processing. The proposed method applies to spatially varying states as well as invariant ones, so it can also handle general cases, including images with complex layouts, camera-captured images, and handwritten documents. Specifically, we find connected components (CCs) in a document image and assign a state to each of them. The states of the CCs are then estimated in an energy minimization framework, where the cost function is designed based on frequency-domain analysis and minimized via graph cuts. Using the estimated states, we also develop a new algorithm that performs text block identification and text line extraction. Roughly speaking, we segment an image into text blocks by cutting connections among the CCs that are long compared to the estimated interline spacing, and we group the CCs into text lines by bottom-up grouping along the estimated text line orientation. Experimental results on a variety of document images show that our method is efficient and provides promising results in several document image processing tasks.
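The two grouping steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the per-CC states (interline spacing and line orientation) have already been estimated, it replaces the paper's connection structure with a brute-force comparison of all CC pairs, and the record fields (`x`, `y`, `spacing`, `theta`) and the thresholds `block_factor`, `line_factor`, and `gap_factor` are hypothetical names and values chosen for illustration.

```python
import math
from itertools import combinations


def find(parent, i):
    # Union-find root lookup with path halving.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def group(n, linked):
    """Partition indices 0..n-1 into groups joined by the `linked` pairs."""
    parent = list(range(n))
    for i, j in linked:
        ri, rj = find(parent, i), find(parent, j)
        if ri != rj:
            parent[rj] = ri
    groups = {}
    for i in range(n):
        groups.setdefault(find(parent, i), []).append(i)
    return sorted(groups.values())


def text_blocks(ccs, block_factor=2.0):
    """Keep a connection only if it is short relative to the interline spacing."""
    linked = []
    for i, j in combinations(range(len(ccs)), 2):
        a, b = ccs[i], ccs[j]
        dist = math.hypot(a["x"] - b["x"], a["y"] - b["y"])
        spacing = min(a["spacing"], b["spacing"])
        if dist <= block_factor * spacing:  # hypothetical threshold
            linked.append((i, j))
    return group(len(ccs), linked)


def text_lines(ccs, line_factor=0.5, gap_factor=2.0):
    """Group CCs whose offset lies mostly along the estimated line orientation."""
    linked = []
    for i, j in combinations(range(len(ccs)), 2):
        a, b = ccs[i], ccs[j]
        # Assumes the two angles are close (no wrap-around handling).
        theta = 0.5 * (a["theta"] + b["theta"])
        dx, dy = b["x"] - a["x"], b["y"] - a["y"]
        along = dx * math.cos(theta) + dy * math.sin(theta)
        across = -dx * math.sin(theta) + dy * math.cos(theta)
        spacing = min(a["spacing"], b["spacing"])
        if abs(across) <= line_factor * spacing and abs(along) <= gap_factor * spacing:
            linked.append((i, j))
    return group(len(ccs), linked)


def cc(x, y, spacing=20.0, theta=0.0):
    # Hypothetical CC record: centroid plus its estimated state.
    return {"x": x, "y": y, "spacing": spacing, "theta": theta}


# Synthetic data: a two-line block on the left, a one-line block far to the right.
ccs = [cc(0, 0), cc(10, 0), cc(20, 0),
       cc(0, 20), cc(10, 20), cc(20, 20),
       cc(200, 0), cc(210, 0)]

print(text_blocks(ccs))  # [[0, 1, 2, 3, 4, 5], [6, 7]]
print(text_lines(ccs))   # [[0, 1, 2], [3, 4, 5], [6, 7]]
```

Because every threshold is scaled by the local spacing estimate, the same code would apply unchanged to regions with different font sizes, which is the point of estimating a state per CC rather than one global value.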



Author information

Authors and Affiliations

INMC, Dept. of EECS, Seoul National University
Hyung Il Koo & Nam Ik Cho


Editor information

Editors and Affiliations

GRASP Laboratory, University of Pennsylvania, 3330 Walnut Street, 19104, Philadelphia, PA, USA
Kostas Daniilidis
School of Electrical and Computer Engineering, National Technical University of Athens, 15773, Athens, Greece
Petros Maragos
Department of Applied Mathematics, Ecole Centrale de Paris, Grande Voie des Vignes, 92295, Chatenay-Malabry, France
Nikos Paragios


Cite this paper

Koo, H.I., Cho, N.I. (2010). State Estimation in a Document Image and Its Application in Text Block Identification and Text Line Extraction. In: Daniilidis, K., Maragos, P., Paragios, N. (eds) Computer Vision – ECCV 2010. ECCV 2010. Lecture Notes in Computer Science, vol 6312. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15552-9_31


DOI: https://doi.org/10.1007/978-3-642-15552-9_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15551-2
Online ISBN: 978-3-642-15552-9
eBook Packages: Computer Science, Computer Science (R0)
