Abstract
This paper proposes a word detection method for document images that uses character models and word models to evaluate single-character features and between-character features. First, each text line is segmented into fragments. Second, candidate characters are generated by merging consecutive fragments, and a candidate is accepted if it conforms to the character models of the query word. Third, a path search strategy finds candidate words built from the accepted candidate characters, and the word model determines the matching cost. Experimental results on a dataset of document images demonstrate the effectiveness of the proposed method.
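The three steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `spot_word`, `char_cost`, `pair_cost`, `max_merge`, and `reject` are hypothetical, the character and word models are stood in for by caller-supplied cost functions, and dynamic programming is assumed for the path search, which the abstract does not specify.

```python
from typing import Callable, Dict, List, Optional, Tuple

Span = Tuple[int, int]  # half-open range [start, end) of fragment indices


def spot_word(
    n_fragments: int,
    query: str,
    char_cost: Callable[[Span, str], float],   # hypothetical single-character model
    pair_cost: Callable[[Span, Span], float],  # hypothetical between-character model
    max_merge: int = 3,                        # assumed cap on fragments per character
    reject: float = float("inf"),              # assumed rejection threshold per character
) -> Optional[Tuple[float, List[Span]]]:
    """Return (total cost, character spans) of the cheapest candidate word, or None."""
    if not query or n_fragments <= 0:
        return None

    # Step 2: candidate characters are merges of 1..max_merge consecutive fragments.
    spans = [
        (s, e)
        for s in range(n_fragments)
        for e in range(s + 1, min(s + max_merge, n_fragments) + 1)
    ]

    # layers[k][span] = (cheapest cost of matching query[:k+1] ending at span, previous span)
    first: Dict[Span, Tuple[float, Optional[Span]]] = {}
    for span in spans:
        cost = char_cost(span, query[0])
        if cost <= reject:
            first[span] = (cost, None)
    layers = [first]

    # Step 3: path search, extending each partial word by an adjacent candidate character.
    for ch in query[1:]:
        prev, cur = layers[-1], {}
        for span in spans:
            own = char_cost(span, ch)
            if own > reject:
                continue
            options = [
                (cost + pair_cost(p, span), p)
                for p, (cost, _) in prev.items()
                if p[1] == span[0]  # previous character must end where this one starts
            ]
            if options:
                best_cost, best_prev = min(options)
                cur[span] = (best_cost + own, best_prev)
        layers.append(cur)

    if not layers[-1]:
        return None

    # Backtrack from the cheapest final character to recover the whole word.
    last = min(layers[-1], key=lambda sp: layers[-1][sp][0])
    total = layers[-1][last][0]
    path = [last]
    for k in range(len(query) - 1, 0, -1):
        path.append(layers[k][path[-1]][1])
    path.reverse()
    return total, path
```

In this sketch the single-character and between-character costs are simply added along the path. A caller would compare the returned total against a threshold to decide whether the query word is present in the text line.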
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, X., Huang, Z., Wen, Y., Lu, Y. (2012). Word Detecting in Document Image Based on Two-Stage Model. In: Zhang, W., Yang, X., Xu, Z., An, P., Liu, Q., Lu, Y. (eds) Advances on Digital Television and Wireless Multimedia Communications. Communications in Computer and Information Science, vol 331. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34595-1_25
DOI: https://doi.org/10.1007/978-3-642-34595-1_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34594-4
Online ISBN: 978-3-642-34595-1
eBook Packages: Computer Science, Computer Science (R0)