Part of the book series: Communications in Computer and Information Science (CCIS, volume 331)

Abstract

This paper proposes a word detection method for document images that uses character models and word models to evaluate single-character and between-character features. First, each text line is segmented into fragments. Second, candidate characters are generated by merging consecutive fragments, and a candidate is accepted if it conforms to the character models of the query word. Third, a path search strategy is used to find candidate words constructed from the accepted candidate characters, with the word model determining the matching cost. Experimental results on a dataset of document images demonstrate the effectiveness of the proposed method.
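
The three steps in the abstract can be sketched roughly as a dynamic-programming search over merged fragments. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the models, so `char_cost` (standing in for a character model), `pair_cost` (standing in for the between-character part of the word model), `max_merge`, and `char_threshold` are all hypothetical, as is the requirement that consecutive characters use adjacent fragment spans.

```python
from typing import Callable, Dict, List, Optional, Tuple

Span = Tuple[int, int]  # fragment index range [start, end), end exclusive


def spot_word(
    n_fragments: int,
    query: str,
    char_cost: Callable[[Span, str], float],   # hypothetical character model
    pair_cost: Callable[[Span, Span], float],  # hypothetical between-character model
    max_merge: int = 3,                        # assumed cap on fragments per character
    char_threshold: float = 1.0,               # assumed acceptance threshold
) -> Optional[Tuple[float, List[Span]]]:
    """Return (total cost, spans) of the cheapest match of `query`, or None."""
    if not query:
        return None
    # best[k][span] = (cost, previous span) for matching query[:k+1] ending at span
    best: List[Dict[Span, Tuple[float, Optional[Span]]]] = [dict() for _ in query]
    for k, ch in enumerate(query):
        for start in range(n_fragments):
            last_end = min(start + max_merge, n_fragments)
            for end in range(start + 1, last_end + 1):
                span = (start, end)
                c = char_cost(span, ch)
                if c > char_threshold:
                    continue  # candidate character rejected by the character model
                if k == 0:
                    best[0][span] = (c, None)
                    continue
                # Extend paths whose last character ends where this one starts.
                options = [
                    (prev_cost + pair_cost(prev, span) + c, prev)
                    for prev, (prev_cost, _) in best[k - 1].items()
                    if prev[1] == start
                ]
                if options:
                    best[k][span] = min(options)
    if not best[-1]:
        return None
    # Pick the cheapest final span, then walk the back-pointers.
    span, (total, prev) = min(best[-1].items(), key=lambda kv: kv[1][0])
    path = [span]
    k = len(query) - 1
    while prev is not None:
        path.append(prev)
        k -= 1
        prev = best[k][prev][1]
    path.reverse()
    return total, path


# Toy usage: four fragments, where fragments 0 and 1 together form "a".
TOY_TRUTH = {(0, 2): "a", (2, 3): "b", (3, 4): "c"}


def toy_char_cost(span: Span, ch: str) -> float:
    return 0.0 if TOY_TRUTH.get(span) == ch else 2.0


def toy_pair_cost(prev: Span, cur: Span) -> float:
    return 0.1


match_ab = spot_word(4, "ab", toy_char_cost, toy_pair_cost)
match_ac = spot_word(4, "ac", toy_char_cost, toy_pair_cost)
```

In the toy data, the query "ab" is found by merging the first two fragments into one character, while "ac" has no match because the two characters are not adjacent. A real system would replace the toy cost functions with trained models and would likely relax the adjacency assumption.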

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, X., Huang, Z., Wen, Y., Lu, Y. (2012). Word Detecting in Document Image Based on Two-Stage Model. In: Zhang, W., Yang, X., Xu, Z., An, P., Liu, Q., Lu, Y. (eds) Advances on Digital Television and Wireless Multimedia Communications. Communications in Computer and Information Science, vol 331. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34595-1_25

  • DOI: https://doi.org/10.1007/978-3-642-34595-1_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34594-4

  • Online ISBN: 978-3-642-34595-1

  • eBook Packages: Computer Science, Computer Science (R0)
