A Novel Word Spotting Algorithm Using Bidirectional Long Short-Term Memory Neural Networks

  • Volkmar Frinken
  • Andreas Fischer
  • Horst Bunke
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5998)

Abstract

Keyword spotting refers to the process of retrieving all instances of a given keyword in a document. This paper describes a novel keyword spotting system for handwritten documents. It is derived from a neural-network-based system for unconstrained handwriting recognition. As such, it performs template-free spotting, i.e., a keyword does not need to appear in the training set. Spotting is performed with a modification of the Connectionist Temporal Classification (CTC) Token Passing algorithm. We demonstrate that such a system has the potential for high performance: for example, it reaches a precision of 95% at 50% recall for the 4,000 most frequent words on the IAM offline handwriting database.
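
The "precision at a given recall" figure quoted in the abstract can be illustrated with a small sketch. It assumes each candidate text line has been assigned a spotting score (higher meaning more likely to contain the keyword) and a ground-truth label. The function name `precision_at_recall` and the sample data are hypothetical and are not taken from the paper; this is not the authors' evaluation code, and ties between scores are not treated specially.

```python
from typing import List, Tuple


def precision_at_recall(scored: List[Tuple[float, bool]],
                        target_recall: float) -> float:
    """Precision at the first rank where recall reaches target_recall.

    scored: (score, is_relevant) pairs; a higher score means the line is
    considered more likely to contain the keyword.
    """
    total_relevant = sum(1 for _, relevant in scored if relevant)
    if total_relevant == 0:
        raise ValueError("no relevant items in the ground truth")

    # Rank candidates from most to least confident.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)

    true_positives = 0
    for retrieved, (_, relevant) in enumerate(ranked, start=1):
        if relevant:
            true_positives += 1
        recall = true_positives / total_relevant
        if recall >= target_recall:
            # Precision = relevant retrieved / all retrieved so far.
            return true_positives / retrieved

    raise ValueError("target recall cannot be reached")


# Hypothetical scores for six text lines, four of which contain the keyword.
example = [(0.9, True), (0.8, False), (0.7, True),
           (0.6, True), (0.2, False), (0.1, True)]
p50 = precision_at_recall(example, 0.5)
```

Lowering the acceptance threshold moves further down the ranked list, which raises recall and typically lowers precision; a single operating point such as "precision at 50% recall" summarizes one position on that trade-off.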

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Volkmar Frinken (1)
  • Andreas Fischer (1)
  • Horst Bunke (1)
  1. Institute of Computer Science and Applied Mathematics, University of Bern, Bern, Switzerland
